Towards Learning Coupled Representations for Cross-Lingual Information Retrieval
Abstract
We explore the use of dictionary-based approaches for cross-lingual information retrieval tasks and propose a novel Coupled Dictionary Learning (CDL) algorithm that simultaneously learns two separate representations for documents in a parallel corpus, together with mappings from one representation to the other. We evaluate the proposed algorithm on the task of comparable document retrieval and compare it with existing baselines.
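The abstract does not give the CDL objective, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes documents are columns of two aligned matrices `X` and `Y` (for example TF-IDF vectors of a parallel corpus), replaces any sparsity penalty with a simple ridge (L2) penalty so that each update has a closed form, and learns a single linear mapping `W` from source codes to target codes. All names and hyperparameters (`gamma`, `lam`, `k`) are hypothetical.

```python
import numpy as np

def objective(X, Y, Dx, Dy, A, B, W, gamma, lam):
    # Reconstruction error in both languages, coupling error, ridge penalty.
    rec = np.sum((X - Dx @ A) ** 2) + np.sum((Y - Dy @ B) ** 2)
    couple = gamma * np.sum((B - W @ A) ** 2)
    reg = lam * sum(np.sum(M ** 2) for M in (A, B, Dx, Dy, W))
    return rec + couple + reg

def coupled_dictionary_learning(X, Y, k=8, gamma=1.0, lam=0.1, n_iter=20, seed=0):
    """Alternating ridge updates (an assumed simplification, not the paper's method)."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    Dx = rng.standard_normal((X.shape[0], k))  # source-language dictionary
    Dy = rng.standard_normal((Y.shape[0], k))  # target-language dictionary
    A = rng.standard_normal((k, n))            # source codes, one column per document
    B = rng.standard_normal((k, n))            # target codes
    W = np.eye(k)                              # mapping from source codes to target codes
    I = np.eye(k)
    history = [objective(X, Y, Dx, Dy, A, B, W, gamma, lam)]
    for _ in range(n_iter):
        # Each step exactly minimizes the objective over one block of variables.
        A = np.linalg.solve(Dx.T @ Dx + gamma * W.T @ W + lam * I,
                            Dx.T @ X + gamma * W.T @ B)
        B = np.linalg.solve(Dy.T @ Dy + (gamma + lam) * I,
                            Dy.T @ Y + gamma * W @ A)
        Dx = np.linalg.solve(A @ A.T + lam * I, A @ X.T).T
        Dy = np.linalg.solve(B @ B.T + lam * I, B @ Y.T).T
        W = np.linalg.solve(gamma * A @ A.T + lam * I, gamma * A @ B.T).T
        history.append(objective(X, Y, Dx, Dy, A, B, W, gamma, lam))
    return Dx, Dy, W, history

def encode(D, x, lam=0.1):
    # Ridge code of a single document vector x under dictionary D.
    k = D.shape[1]
    return np.linalg.solve(D.T @ D + lam * np.eye(k), D.T @ x)

def rank_targets(x_query, Y_docs, Dx, Dy, W, lam=0.1):
    # Map the query code into the target space, then rank target documents by cosine.
    b_hat = W @ encode(Dx, x_query, lam)
    codes = encode(Dy, Y_docs, lam)
    sims = (codes.T @ b_hat) / (
        np.linalg.norm(codes, axis=0) * np.linalg.norm(b_hat) + 1e-12)
    return np.argsort(-sims)

if __name__ == "__main__":
    # Toy synthetic "parallel corpus": both views share one latent factor matrix.
    rng = np.random.default_rng(1)
    Z = rng.standard_normal((5, 40))
    X = rng.standard_normal((30, 5)) @ Z
    Y = rng.standard_normal((25, 5)) @ Z
    Dx, Dy, W, history = coupled_dictionary_learning(X, Y, k=5)
    print(history[0], history[-1])
    print(rank_targets(X[:, 0], Y, Dx, Dy, W)[:3])
```

Because every update is the exact minimizer of a convex subproblem, the objective should not increase from one iteration to the next. A sparse-coding variant would replace the ridge updates for `A` and `B` with an L1-penalized solver.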
Related resources
Learning a Cross-Lingual Semantic Representation of Relations Expressed in Text
Learning cross-lingual semantic representations of relations from textual data is useful for tasks like cross-lingual information retrieval and question answering. So far, research has been mainly focused on cross-lingual entity linking, which is confined to linking between phrases in a text document and their corresponding entities in a knowledge base but cannot link to relations. In this pape...
Cross-Lingual Word Representations via Spectral Graph Embeddings
Cross-lingual word embeddings are used for cross-lingual information retrieval or domain adaptations. In this paper, we extend Eigenwords, spectral monolingual word embeddings based on canonical correlation analysis (CCA), to crosslingual settings with sentence-alignment. For incorporating cross-lingual information, CCA is replaced with its generalization based on the spectral graph embeddings....
Workshop LECLIQ: Lessons Learned from Evaluation: Towards Integration and Transparency in Cross-Lingual Information Retrieval with a special Focus on Quality Gates
In this paper we give an overview of the workshop on “Lessons Learned from Evaluation: Towards Integration and Transparency in Cross-Lingual Information Retrieval” that was held in Lisbon, Portugal, on May 30, 2004, in conjunction with the 4th International Conference on Language Resources and Evaluation (LREC-04).
Cross-lingual Information Retrieval Model based on Bilingual Topic Correlation
Constructing relationships between bilingual texts is important for effectively processing multilingual text data and crossing language barriers. Corpus-based cross-lingual latent semantic indexing (CL-LSI) does not fully take into account bilingual semantic relationships. The paper proposes a new model that builds semantic relationships between bilingual parallel documents via partial least squares (PLS)....
Cross-lingual information retrieval systems
In this work, we will explore different approaches used in Cross-Lingual Information Retrieval (CLIR) systems. We focus mainly on CLIR systems that use statistical machine translation (SMT) systems to translate queries into the collection language. This will include using SMT systems as a black box or as a white box, as well as SMT systems that are tuned towards better CLIR performance. After that, we will pre...
Publication date: 2012